Although an essential gene should, in principle, not attract any transposon insertion, a transposon may still be found in an essential gene for several reasons [Lamichhane, et al., 2003; Deng, et al., 2013; Fels, et al., 2013]. For instance, the essentiality of a gene may not be disrupted if a transposon is inserted within the distal regions of the gene, because the gene product may still retain functionality. Often, the essentiality of a gene may also be maintained if a transposon is inserted in only certain parts of the gene. In addition to these biological reasons, there are also technical reasons that make it necessary to examine the transposon insertion pattern carefully genome-wide to avoid missing essential genes. First, transposon sequencing technology is not yet accurate enough to distribute transposons to the locations where they should be without any bias. Second, aligning short transposon sequencing reads to a reference genome is not free of error: the shorter the sequencing reads used for an alignment, the greater the error will be. Therefore, a few insertions occurring in a gene may be caused by either or both of the aforementioned errors.

Due to the aforementioned reasons, there have been three updated learning approaches for robust gene essentiality pattern analysis using high-throughput transposon sequencing technology. First, a gene with a few insertion sites may be classified as an essential gene [...ge, et al., 2009]. Second, a gene with a few insertions may also be classified as an essential gene [Zomer, et al., 2012]. Third, a gene with transposon insertions only in its distal regions may also be classified as an essential gene [Yang, et al., 2017].
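
The three criteria above rely on different per-gene statistics: the number of distinct insertion sites, the number of insertions, and the number of insertions outside the distal regions. The following is a minimal sketch of how such statistics might be computed. It is not the procedure of any of the cited works. The names `Gene` and `gene_statistics`, the choice to count "insertions" as summed read counts, and the 10% `distal_fraction` are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Gene:
    name: str
    start: int  # first base of the gene (inclusive)
    end: int    # last base of the gene (inclusive)


def gene_statistics(gene: Gene,
                    insertions: Dict[int, int],
                    distal_fraction: float = 0.1) -> Dict[str, float]:
    """Summarise transposon insertions for one gene.

    `insertions` maps a genomic position to the number of reads
    supporting an insertion at that position (hypothetical input format).
    `distal_fraction` is an assumed cut-off: this share of the gene
    length at each end is treated as the distal region.
    """
    length = gene.end - gene.start + 1
    margin = int(length * distal_fraction)
    core_start = gene.start + margin
    core_end = gene.end - margin

    # Insertions that fall anywhere inside the gene.
    in_gene = {pos: n for pos, n in insertions.items()
               if gene.start <= pos <= gene.end}
    # Insertions that fall outside the distal regions.
    core = {pos: n for pos, n in in_gene.items()
            if core_start <= pos <= core_end}

    return {
        "sites": len(in_gene),             # distinct insertion sites
        "reads": sum(in_gene.values()),    # total insertions (read counts)
        "core_sites": len(core),           # sites outside distal regions
        "sites_per_kb": 1000.0 * len(in_gene) / length,
    }


# Toy example: a 1000 bp gene with insertions only near its two ends.
gene = Gene("geneA", 1001, 2000)
insertions = {1005: 3, 1050: 1, 1990: 7, 3000: 4}
stats = gene_statistics(gene, insertions)
# stats["sites"] == 3, stats["reads"] == 11, stats["core_sites"] == 0
```

In this toy example the gene carries three insertion sites, yet none lies outside the distal regions, so a rule based on the third criterion could still treat it as a candidate essential gene, whereas a rule based only on the raw site count might not.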

Density estimation approaches have long been used for discovering essential genes. In the context of gene essentiality analysis, the principle of density estimation is to reconstruct a density function, which is unknown in advance, for a transposon statistic such as the transposon insertions per gene statistic or the transposon insertion sites per gene statistic. The constructed density model is then used for classification and prediction. The assumption is that such a density model should show a bimodal distribution, in which a cutting point can be determined to separate genes into two clusters. After a